
    UG^2: a Video Benchmark for Assessing the Impact of Image Restoration and Enhancement on Automatic Visual Recognition

    Advances in image restoration and enhancement techniques have led to discussion about how such algorithms can be applied as a pre-processing step to improve automatic visual recognition. In principle, techniques like deblurring and super-resolution should yield improvements by de-emphasizing noise and increasing signal in an input image. But the historically divergent goals of the computational photography and visual recognition communities have created a significant need for more work in this direction. To facilitate new research, we introduce a new benchmark dataset called UG^2, which contains three difficult real-world scenarios: uncontrolled videos taken by UAVs and manned gliders, as well as controlled videos taken on the ground. Over 160,000 annotated frames for hundreds of ImageNet classes are available, which are used for baseline experiments that assess the impact of known and unknown image artifacts and other conditions on common deep learning-based object classification approaches. Further, current image restoration and enhancement techniques are evaluated by determining whether or not they improve baseline classification performance. Results show that there is plenty of room for algorithmic innovation, making this dataset a useful tool going forward.
    Comment: Supplemental material: https://goo.gl/vVM1xe, Dataset: https://goo.gl/AjA6En, CVPR 2018 Prize Challenge: ug2challenge.or

    Meet-in-the-middle: Multi-scale upsampling and matching for cross-resolution face recognition

    In this paper, we aim to address the large domain gap between high-resolution face images, e.g., from professional portrait photography, and low-quality surveillance images, e.g., from security cameras. Establishing an identity match between disparate sources like this is a classical surveillance face identification scenario, which continues to be a challenging problem for modern face recognition techniques. To that end, we propose a method that combines face super-resolution, resolution matching, and multi-scale template accumulation to reliably recognize faces from long-range surveillance footage, including from low-quality sources. The proposed approach does not require training or fine-tuning on the target dataset of real surveillance images. Extensive experiments show that our proposed method is able to outperform even existing methods fine-tuned to the SCFace dataset.
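
The abstract above does not say how the multi-scale templates are accumulated. One common way to fuse several embeddings of the same face is to L2-normalize each one, sum them, and renormalize. The sketch below illustrates only that generic idea; the function names are hypothetical, and the per-scale embeddings are assumed to come from some face recognition model applied to the image after super-resolution at different scales.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]

def accumulate_template(embeddings):
    """Fuse per-scale embeddings into one template (assumed scheme).

    Each embedding is normalized first so that no single scale
    dominates the sum, then the sum is renormalized.
    """
    normalized = [l2_normalize(e) for e in embeddings]
    summed = [sum(components) for components in zip(*normalized)]
    return l2_normalize(summed)

def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

# Hypothetical embeddings of one probe face at two upsampling scales.
probe_template = accumulate_template([[1.0, 0.0], [0.0, 1.0]])
gallery_embedding = [1.0, 1.0]
match_score = cosine_similarity(probe_template, gallery_embedding)
```

A probe would then be assigned the gallery identity with the highest `match_score`; the actual fusion and matching rules used in the paper may differ.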

    Recovery of superquadric parameters from range images using deep learning

    With the recent advancements in deep neural computation, we devise a method to recover superquadric parameters from range images using a convolutional neural network. By training our simple, fully convolutional architecture on synthetic images, each containing a single superquadric, we achieve encouraging results. In a fixed rotation scenario, the model could already be used in practice, but we still need to improve the prediction of arbitrary rotational parameters in the future.
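
For readers unfamiliar with superquadrics, the sketch below evaluates the standard inside-outside function of an origin-centered, axis-aligned superquadric with sizes `a1, a2, a3` and shape exponents `e1, e2`. The abstract does not state which parameterization the authors regress, so the textbook form is assumed here, and position and rotation are left out.

```python
def superquadric_inside_outside(x, y, z, a1, a2, a3, e1, e2):
    """Inside-outside function F of an origin-centered superquadric.

    F < 1 means the point is inside, F == 1 on the surface,
    and F > 1 outside. a1, a2, a3 are the sizes along the axes;
    e1, e2 are the shape exponents (both assumed positive).
    """
    # Absolute values keep the fractional powers real for negative coordinates.
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term

# With e1 = e2 = 1 and unit sizes, the superquadric is the unit sphere.
on_surface = superquadric_inside_outside(1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0)
inside = superquadric_inside_outside(0.5, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0)
```

Shape exponents close to 0 give box-like shapes, while values of 1 give ellipsoids, which is why a handful of parameters can describe a wide range of primitives.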

    Segmentation and Recovery of Superquadric Models using Convolutional Neural Networks

    In this paper we address the problem of representing 3D visual data with parameterized volumetric shape primitives. Specifically, we present a two-stage approach built around convolutional neural networks (CNNs) capable of segmenting complex depth scenes into the simpler geometric structures that can be represented with superquadric models. In the first stage, our approach uses a Mask RCNN model to identify superquadric-like structures in depth scenes; in the second stage, it fits superquadric models to the segmented structures using a specially designed CNN regressor. Using our approach we are able to describe complex structures with a small number of interpretable parameters. We evaluated the proposed approach on synthetic as well as real-world depth data and show that our solution not only achieves competitive performance in comparison to the state-of-the-art, but is also able to decompose scenes into a number of superquadric models in a fraction of the time required by competing approaches. We make all data and models used in the paper available from https://lmi.fe.uni-lj.si/en/research/resources/sq-seg.
    Comment: 8 pages, in Computer Vision Winter Workshop, 202

    EFaR 2023: Efficient Face Recognition Competition

    This paper presents the summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023). The competition received 17 submissions from 6 different teams. To drive further development of efficient face recognition models, the submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a diverse set of benchmarks, as well as the deployability given by the number of floating-point operations and model size. The evaluation of submissions is extended to bias, cross-quality, and large-scale recognition benchmarks. Overall, the paper gives an overview of the achieved performance values of the submitted solutions as well as a diverse set of baselines. The submitted solutions use small, efficient network architectures to reduce the computational cost, and some solutions also apply model quantization. An outlook on possible techniques that are underrepresented in current solutions is given as well.
    Comment: Accepted at IJCB 202
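
The abstract does not give the scoring formula, so the following is only an illustrative sketch of how a weighted score could trade accuracy against deployability. The weights, the normalization by the largest submission, and the field names `accuracy`, `flops`, and `model_size` are all assumptions made for this example, not the competition's actual rules.

```python
def weighted_score(accuracy, flops, model_size, max_flops, max_size,
                   w_acc=0.7, w_flops=0.15, w_size=0.15):
    """Combine accuracy and deployability into one score (assumed scheme).

    Cost terms are normalized by the largest value among all
    submissions, so the cheapest models get terms close to 1.
    """
    flops_term = 1.0 - flops / max_flops
    size_term = 1.0 - model_size / max_size
    return w_acc * accuracy + w_flops * flops_term + w_size * size_term

def rank_submissions(submissions):
    """Return submission names ordered from best to worst score."""
    max_flops = max(s["flops"] for s in submissions)
    max_size = max(s["model_size"] for s in submissions)
    scored = [
        (weighted_score(s["accuracy"], s["flops"], s["model_size"],
                        max_flops, max_size), s["name"])
        for s in submissions
    ]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical submissions: mean verification accuracy, FLOPs, size in MB.
example = [
    {"name": "A", "accuracy": 0.95, "flops": 1.0e9, "model_size": 4.0},
    {"name": "B", "accuracy": 0.96, "flops": 0.2e9, "model_size": 1.0},
]
ranking = rank_submissions(example)
```

Under these assumed weights, a smaller model with comparable accuracy ranks ahead of a larger one, which is the behavior an efficiency-focused ranking is meant to reward.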

    AUTOMATED FACE RECOGNITION FROM LOW-RESOLUTION IMAGERY

    In this doctoral dissertation, we address the problem of automatic face recognition from low-resolution images using deep learning methods. Deep learning methods have recently achieved a major breakthrough in the performance of face recognition procedures. Deep neural models are trained for face recognition on datasets of several million images and, owing to the diversity of the images in the training datasets alone, are able to operate under conditions such as changes in illumination, facial pose, and expression, unlike classical approaches to face recognition, where the effects of such factors are modeled explicitly. Despite the breakthrough brought by deep learning, automatic face recognition systems still fall short of human abilities in some circumstances. One such circumstance is low resolution of face images, which can result either from capturing images with low-quality cameras or from the distance of the face from the camera. In the dissertation, we first carry out a systematic study of the effects of image quality factors on the capability of automatic face recognition systems, in which we find that image resolution has a strong effect on recognition performance. We then develop a method for image quality enhancement based on a new convolutional neural network architecture for super-resolution and a new criterion function for face super-resolution that takes into account both reconstruction quality and the identity information content. In experiments comparing against competing models for face image quality enhancement, we find that the developed model has a better ability to reconstruct high-resolution details and is more useful for higher-level computer vision tasks such as face recognition and facial landmark localization.
Based on the developed model, we carry out a study of the bias of super-resolution models and find that all tested models exhibit a pronounced bias towards the image degradation model used to generate the training dataset for super-resolution. Because of this bias, none of the tested face image enhancement models is able to systematically improve images in terms of their usefulness for face recognition when the images are real low-resolution images rather than artificially downsampled ones. Based on this finding, we develop a new method for face recognition from low-resolution images, which builds on the results of the previously developed model for enhancing low-resolution images. The method is based on combining super-resolution results at multiple scales and extracting features with pretrained face recognition models. With experiments on the SCFace dataset, we show that the developed method successfully exploits the information added by the image enhancement models and improves face recognition results.
Recently, significant advances in the field of automated face recognition have been achieved using computer vision, machine learning, and deep learning methodologies. However, despite claims of super-human performance of face recognition algorithms on select key benchmark tasks, there remain several open problems that preclude the general replacement of human face recognition work with automated systems. State-of-the-art automated face recognition systems based on deep learning methods are able to achieve high accuracy when the face images they are given are of sufficiently high quality. However, low image resolution remains one of the principal obstacles to face recognition systems, and their performance in the low-resolution regime is decidedly below human capabilities.
In this PhD thesis, we present a systematic study of modern automated face recognition systems in the presence of image degradation in various forms. Based on our findings, we then propose a novel technique for improving the quality of low-resolution face images. Specifically, we present a novel deep learning model architecture for image super-resolution, and a novel training procedure for face hallucination that trains the model to super-resolve face images in a manner that preserves the information about the subject identity present in the low-resolution image. We validate the model by comparing its image reconstruction capability against several state-of-the-art models, as well as its performance on downstream semantic tasks including face recognition and face landmark localization. Next, we study the generalization capabilities of super-resolution-based face hallucination models, and find most of the models studied to be heavily biased towards the artificial image degradation process used to generate their training datasets. We notice that due to this bias, none of the face hallucination models considered are able to outperform an interpolation baseline on face recognition benchmarks with real-life low-resolution images. To overcome this problem, we then develop a novel method for face recognition from low-resolution images that uses the results of multi-scale face hallucination models developed earlier. The proposed method is able to benefit from the high-resolution information added by the face hallucination models without suffering from the training set bias they exhibit, and systematically outperforms the interpolation baseline and other state-of-the-art low-resolution face recognition models on the SCFace benchmark. Our proposed methods are trained on large face image datasets in a manner typical for deep learning models.
However, the resulting trained models are useful for face recognition applications in an open-set regime, and do not need to be re-trained for novel subjects.